Vanishing gradient problem

The tendency for the gradients of the early hidden layers of some DNNs to become surprisingly small. Increasingly small gradients produce increasingly small updates to the weights of neurons in those layers, leading to little or no learning. Models suffering from the vanishing gradient problem become difficult or impossible to train. LSTM cells address this issue.[1]
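A minimal NumPy sketch (not from the cited glossary entry) that illustrates the effect: gradient norms are tracked while backpropagating through a deep stack of sigmoid layers, and they shrink sharply toward the early layers. The layer count, width, and initialization scale are arbitrary choices made for the illustration.

```python
# Illustrative sketch of vanishing gradients in a deep sigmoid network.
import numpy as np

rng = np.random.default_rng(0)
n_layers, width = 20, 64

# Random weights and a random input; the 0.5 scale is an arbitrary choice
# that makes the shrinking gradients easy to see with sigmoid activations.
weights = [rng.normal(0, 0.5, (width, width)) for _ in range(n_layers)]
x = rng.normal(size=(width, 1))

# Forward pass, keeping each layer's activation for backprop.
activations = [x]
for W in weights:
    activations.append(1.0 / (1.0 + np.exp(-W @ activations[-1])))

# Backward pass: propagate a unit upstream gradient and record the gradient
# norm with respect to each layer's pre-activation.
grad = np.ones_like(activations[-1])
norms = []
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = grad * a * (1.0 - a)      # sigmoid derivative a * (1 - a)
    norms.append(np.linalg.norm(grad))
    grad = W.T @ grad                # pass the gradient to the layer below

# Norms were recorded from the last layer back to the first; printing them
# in layer order shows the early layers receiving much smaller gradients.
for depth, g in enumerate(reversed(norms), start=1):
    print(f"layer {depth:2d}: grad norm = {g:.3e}")
```

Running this typically prints gradient norms that are orders of magnitude smaller for layer 1 than for the final layer, which is the behavior the definition above describes.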

See also

Footnotes

  1. developers.google.com/machine-learning/glossary#vanishing-gradient-problem
